Excess Clustering on Large Scales in the MegaZ DR7 Photometric Redshift Survey 
We observe a large excess of power in the statistical clustering of Luminous Red Galaxies in the photometric
SDSS galaxy sample called MegaZ DR7. This is seen over the lowest multipoles in the angular power spectra
C_ℓ in four equally spaced redshift bins between 0.45 < z < 0.65. However, it is most prominent in the
highest redshift band at ~4σ and it emerges at an effective scale k < 0.01 h Mpc^-1. Given that MegaZ DR7
is the largest cosmic volume galaxy survey to date (3.3 (Gpc h^-1)^3) this implies an anomaly on the largest
physical scales probed by galaxies. Alternatively, this signature could be a consequence of it appearing at the
most systematically susceptible redshift. There are several explanations for this excess power that range from
systematics to new physics. This could have important consequences for the next generation of galaxy surveys
or the ΛCDM model. We test the survey, data and excess power, as well as possible origins.
Introduction - A galaxy survey contains vast and varied
information related to cosmological physics. The galaxies act
as tracers of the underlying mass distribution whose statistical
clustering enables a determination of the cosmological model.
This is complementary to the CMB and directly probes the
late-time Universe. Historically, this statistical distribution
has been a penetrating indicator of new physics: the analysis
of the galaxy correlation function in the APM survey showed
one of the first signs that Ω_m < 1 [1][2], before supernovae.

Since then galaxy surveys have matured with an emphasis
on large cosmic volumes, which probe new scales, and
immense galaxy numbers. Due to limited resources one
approach has been to meet this demand at the cost of reduced
redshift precision. Rather than spectroscopy one instead
obtains a redshift estimate based on the overall flux through
broad band filters. This is called a photometric redshift. It has
resulted in the leading MegaZ DR7 photometric catalogue [3]
and is the basis for the Dark Energy Survey [4].

Any deviations from the current ΛCDM statistical profile
could be the sign of new physics emerging over the largest
scales in the Universe. Alternatively, any signatures could be
systematics that affect the photometric method and therefore
future projects. Detecting these systematics is vital not only
for avoiding a biased inferred cosmology but also for the
planning of surveys.

In this letter we highlight anomalous signatures in the new 
MegaZ DR7 galaxy angular power spectra given by excess 
power over large scales probed in the late-time Universe. We 
test the survey, excess power, possible systematics and high- 
light a subset of potential theoretical explanations. 

MegaZ - The MegaZ DR7 survey [3] is a new catalogue of
Luminous Red Galaxies (LRGs) based on the final Sloan
Digital Sky Survey (SDSS) II photometric release [5]. It is an
update of the previous DR4 catalogue given by [6][7]. Covering
almost one fifth of the entire sky it includes 723,556 LRGs
in four equally spaced redshift bins with width Δz = 0.05
between 0.45 < z < 0.65.

We performed a spherical harmonic analysis of the galaxy
distribution [3] by determining the theoretical angular power
spectra,

    C_\ell^{ij} = \langle \delta_i^{2D} \, \delta_j^{2D*} \rangle = 4\pi \int \Delta^2(k) \, W_i(k) \, W_j(k) \, \frac{dk}{k}    (1)

for all four bins, where Δ²(k) is the dimensionless power
spectrum. This is the projected power spectrum with window
functions given by W_i(k) for bin i under consideration.
This is further detailed by W_i(k) = ∫ f(z) j_ℓ(kz) dz and
f(z) = n(z) D(z) (dz/dx), with the spherical Bessel function
j_ℓ(kz), the linear growth factor D(z), comoving coordinate x
and the normalised redshift distribution n(z). Redshift space
distortions act to alter the shape of C_ℓ and we include this as
described in [3][8][9]. The galaxies' photometric redshifts are
determined with the ANNz code [10], using the spectroscopically
and photometrically defined 2SLAQ survey [11] as a
representative training set as in [12]. This was found to give
the best photometric redshift estimate compared to template
based methods [3][12]. The LRGs provide reliable photometric
redshifts given that they are old, red, elliptical systems with
a stable spectral energy distribution and sharp 4000 Å break.
Furthermore, due to their high luminosity, they probe a vast
region of cosmic volume and therefore the largest of physical
scales. The measured C_ℓ are determined using,
where a_ℓm are the harmonic coefficients of the projected
distribution and ΔΩ and N correspond to survey area and
number of galaxies, respectively. The coefficient corrections are
given as,

    I_{\ell m} = \int_{\Delta\Omega} Y_{\ell m}^{*} \, d\Omega, \qquad J_{\ell m} = \int_{\Delta\Omega} |Y_{\ell m}|^{2} \, d\Omega.    (3)
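
To illustrate why the corrections in Equation 3 are needed on a partial sky, the following minimal Python sketch evaluates I_ℓm and J_ℓm numerically for a single mode (ℓ = 1, m = 0). It assumes an idealised polar-cap footprint covering one fifth of the sky, which is only a stand-in for the real DR7 mask, and the function names are hypothetical.

```python
import math

def y10(theta):
    """Spherical harmonic Y_10(theta) = sqrt(3 / 4pi) * cos(theta); it is real."""
    return math.sqrt(3.0 / (4.0 * math.pi)) * math.cos(theta)

def cap_integrals(theta_max, n_steps=20000):
    """Midpoint-rule estimates of I_10 and J_10 over a polar cap.

    The cap 0 <= theta <= theta_max is an assumed toy footprint,
    not the actual survey geometry.
    """
    d_theta = theta_max / n_steps
    i_lm = 0.0
    j_lm = 0.0
    for step in range(n_steps):
        theta = (step + 0.5) * d_theta
        # Solid-angle element with the azimuthal angle already integrated out.
        d_omega = 2.0 * math.pi * math.sin(theta) * d_theta
        y = y10(theta)
        i_lm += y * d_omega
        j_lm += y * y * d_omega
    return i_lm, j_lm

# Full sky: orthonormality gives I_10 = 0 and J_10 = 1.
i_full, j_full = cap_integrals(math.pi)

# Cap covering a sky fraction f_sky: (1 - cos(theta_max)) / 2 = f_sky.
f_sky = 0.2  # "almost one fifth of the entire sky", rounded for illustration
theta_cap = math.acos(1.0 - 2.0 * f_sky)
i_cap, j_cap = cap_integrals(theta_cap)

print(f"full sky: I_10 = {i_full:.6f}, J_10 = {j_full:.6f}")
print(f"cap:      I_10 = {i_cap:.6f}, J_10 = {j_cap:.6f}")
```

On the full sky I_10 vanishes and J_10 is unity, so no correction is needed. On the cap neither holds, which is the sense in which the raw harmonic coefficients must be corrected for the survey geometry.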

Due to statistical isotropy the C_ℓ,m are averaged over the
(2ℓ + 1) a_ℓm values. These points are further binned into
multipole bands of width Δℓ = 10 to decorrelate the data,
which are correlated as a consequence of the partial sky
coverage and subsequent convolution. Further details can be
found in [3][7][13], in addition to the full DR7 data in [3].
Moreover, the constraining potential of this powerful data set
has been shown for neutrino masses in [14].

FIG. 1: The angular power spectra C_ℓ measured in the SDSS
photometric MegaZ DR7 Luminous Red Galaxy survey. The panels
relate to four redshift bins with width Δz = 0.05 from z = 0.45 to
z = 0.65. The best fit theoretical spectra (solid lines) are excellent
matches to the data including multipoles up to ℓ ~ 500. However,
the largest angular scales are observed to be anomalous; the dashed
lines correspond to 1 → 4σ derived from simulations. This is
particularly severe in the highest redshift bin (main panel), which is ~4σ.

Excess Power - The measured C_ℓ for the four redshift bins
are shown in Figure 1. This illustrates the excess power over
the lowest multipole bands compared to the rest of the data
and the best fit theoretical profile (solid line). It is particularly
prominent in the highest redshift bin (0.6 < z < 0.65; main
panel). The model error bars in this plot have been assigned
using Equation 4. This accounts for the expected shot noise,
survey area and cosmic variance.
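
As a rough illustration of how shot noise, survey area and cosmic variance combine in such an error bar, the sketch below uses the commonly quoted Gaussian expression σ(C_ℓ) = sqrt(2 / [(2ℓ + 1) Δℓ f_sky]) (C_ℓ + 1/n̄). This form is an assumption made here for illustration and may differ in detail from the letter's Equation 4. The function name, the example C_ℓ amplitude and the per-bin galaxy number are hypothetical.

```python
import math

def band_error(c_ell, ell_centre, delta_ell, f_sky, n_per_sr):
    """Gaussian 1-sigma error on a band-averaged C_ell (assumed standard form).

    c_ell      : model power in the band
    ell_centre : central multipole of the band
    delta_ell  : band width
    f_sky      : fraction of the sky surveyed
    n_per_sr   : galaxy surface density per steradian (sets the shot noise)
    """
    n_modes = (2.0 * ell_centre + 1.0) * delta_ell * f_sky
    return math.sqrt(2.0 / n_modes) * (c_ell + 1.0 / n_per_sr)

# Illustrative inputs only: none of these amplitudes are taken from the data.
f_sky = 0.2                      # roughly one fifth of the sky
n_gal_in_bin = 1.0e5             # hypothetical number of galaxies in one bin
n_per_sr = n_gal_in_bin / (4.0 * math.pi * f_sky)
c_model = 1.0e-4                 # hypothetical band power

for ell_centre in (9, 19, 99):
    sigma = band_error(c_model, ell_centre, 10, f_sky, n_per_sr)
    print(f"ell ~ {ell_centre:3d}: sigma / C_ell = {sigma / c_model:.3f}")
```

The fractional error falls with multipole because more independent modes contribute to each band, which is why the lowest bands carry the largest cosmic variance.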
In this way the excess discrepancies in each bin are seen to
be severe. In order to quantify the anomalies more accurately
we construct 3 × 10^4 Gaussian realisations of the galaxy
field from the best fit C_ℓ. We account for cross correlations
between the redshift bins due to photometric uncertainty and
impose the same DR7 mask, pipeline and measured galaxy
number on the simulations. We successfully reconstructed the
smooth input cosmology from these realisations for ℓ > 4 (our
cut due to the partial sky) and we also reproduced the previous
MegaZ results. This demonstrated the reliability of the
statistic in relation to the survey geometry and the treatment
of shot noise. We then used the variance of these realisations
to obtain a measure of the excess power uncertainty. The
corresponding significance levels (1 → 4σ) are included as the
dashed lines in Figure 1. Just two of the realisations were
as anomalous as the measured value in the lowest multipole
band in the highest redshift bin. This gives a significance of
~4σ. Similarly, the lowest bands in the other bins are
discrepant by ~2σ (bin 1), ~2σ (bin 2) and ~2.5σ (bin 3).
With slightly less significance, ~1σ excesses are observed for
the second multipole bands in bin 3 and bin 4 too. The effect
from including or excluding the excess power in the inferred
cosmological constraints can be sizeable as shown in Figures
8 and 11 from Q. It is interesting that slight hints of excess
power have also been alluded to in Q Q3] with P(k) and
C_ℓ statistics.

FIG. 2: A visualisation of the measured underlying field within the
surveyed region on the sky. Only contributions from ℓ = 4 → 10 in
the most anomalous multipole and redshift band are included.
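
The step from "two of 3 × 10^4 realisations" to a significance of about 4σ can be sketched as follows. This is one common convention, converting the empirical fraction into an equivalent Gaussian number of standard deviations, and it is an assumption that the quoted value was obtained this way. The function names are hypothetical.

```python
from statistics import NormalDist

def empirical_p_value(n_as_extreme, n_realisations):
    """Fraction of simulations at least as anomalous as the data."""
    return n_as_extreme / n_realisations

def gaussian_sigma(p_value, two_sided=True):
    """Equivalent Gaussian significance for a given tail probability."""
    tail = p_value / 2.0 if two_sided else p_value
    return NormalDist().inv_cdf(1.0 - tail)

p = empirical_p_value(2, 30000)
print(f"p-value            = {p:.2e}")
print(f"two-sided sigma    = {gaussian_sigma(p):.2f}")
print(f"one-sided sigma    = {gaussian_sigma(p, two_sided=False):.2f}")
```

Both conventions give a value between roughly 3.8 and 4.0, consistent with the ~4σ quoted above. With only two simulations in the tail the estimate itself carries a large Poisson uncertainty, so it is best read as approximate.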

To assess potential contributions to the most anomalous of
the ℓ bands we reconstruct the underlying matter distribution
implied by that data using HEALPix [16]. The corresponding
a_ℓm → matter visualisation can be observed across the
surveyed region in Figure 2. No clear pattern can be seen that
would suggest the presence of an obvious systematic across
the sky, i.e. there do not seem to be spurious contributions
towards the edges of the survey, closer to the Galactic plane or
concentrated in one surveyed region. However, we now strive
to quantify candidate sources of contamination.

Star-Galaxy Separation - We selected the LRGs from the
photometric sample using the selection criteria described
thoroughly in [5][6] H QT). This was shown to be successful
except for a 5% M-star contamination. We acted to remove
these objects, which vary across the Galactic plane, by
imposing a cut on the star-galaxy separation parameter from ANNz
(δ_sg > 0.2). As described in [6] this ensures that the
contamination is minimised without losing too many real
galaxies (quantified by comparing with the spectroscopic 2SLAQ
sample). However, even with this cut the furthest redshift bin
might still be affected given that it contains fewer natural
objects. To test this we measure the cross correlation between
M-star objects removed from the catalogue (δ_sg < 0.2) and
galaxies within the range 0.6 < z < 0.65. This is shown in
the left panel of Figure 3.

FIG. 3: Left Panel: The cross correlation between galaxies in the highest redshift bin (0.6 < z < 0.65) and M-stars (main plot). No
signal is observed over the profile and even the lowest multipole band is consistent at 1σ. In addition, more severe star-galaxy separation cuts
are found to give no change in the C_ℓ relative to the standard δ_sg > 0.2 cut (inset). The cuts correspond to remaining M-star contaminations of
5% (δ_sg > 0), 1.5% (δ_sg > 0.2), 1.2% (δ_sg > 0.4), 0.8% (δ_sg > 0.6) and 0.5% (δ_sg > 0.8). Right Panel: The cross correlation between the
same galaxies and Galactic extinction is again mostly inconclusive (main plot). For a further test we removed surveyed regions corresponding
to high extinction (> 0.1 mag; 15% of the survey area) and found no change to the measured spectra over the largest scales (inset).

We assign error bars using a generalisation of Equation 4
with the measured star and galaxy auto power spectra C_ℓ^s
and C_ℓ^g as inputs. We also derive errors by cross correlating
the aforementioned Gaussian simulations with the M-star
distribution, with no change in the conclusion: overall there
is no correlation between the two samples, with the lowest
multipole band consistent at 1σ. Even though the correlation
is not significant we go further and repeat the whole
galaxy clustering measurement with a series of more aggressive
(δ_sg > 0.2) star-galaxy separation cuts. The difference
C_ℓ(δ_sg = 0.2) − C_ℓ(δ_sg = x) for the most anomalous band is
illustrated as a function of star-galaxy cut in the inset of
Figure 3. We find no change over the scales of interest, implying
minimal contribution from stars. The plot does, however,
highlight the importance of the initial cut given the change
when δ_sg = 0.
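
The generalisation of the auto-spectrum error to a cross spectrum is not written out in the text. A commonly used Gaussian form, assumed here purely for illustration, is σ²(C_ℓ^gs) = [(C_ℓ^gs)² + C_ℓ^g C_ℓ^s] / [(2ℓ + 1) Δℓ f_sky], where the measured auto spectra already include their shot noise. The function name and the input amplitudes below are hypothetical.

```python
import math

def cross_band_error(c_gs, c_gg, c_ss, ell_centre, delta_ell, f_sky):
    """Gaussian 1-sigma error on a band-averaged cross spectrum (assumed form).

    c_gs : galaxy-star cross power in the band
    c_gg : measured galaxy auto power (shot noise included)
    c_ss : measured star auto power (shot noise included)
    """
    n_modes = (2.0 * ell_centre + 1.0) * delta_ell * f_sky
    return math.sqrt((c_gs ** 2 + c_gg * c_ss) / n_modes)

# Illustrative amplitudes only, not measured values.
sigma = cross_band_error(c_gs=0.0, c_gg=1.2e-4, c_ss=3.0e-4,
                         ell_centre=9, delta_ell=10, f_sky=0.2)
print(f"sigma(C_gs) in the lowest band = {sigma:.3e}")
```

Even when the true cross power is zero the error does not vanish, since chance alignments of two independent fields produce scatter set by the product of their auto spectra. When the two fields are identical the expression reduces to the auto-spectrum form sqrt(2 / n_modes) C_ℓ.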

Extinction - Regions of high Galactic extinction could
cause galaxies to be scattered from the sample as a function
of sky position. The colour and magnitude cuts made on the
photometric catalogue have been performed using extinction
corrected model magnitudes. However, it could be that they
contain errors that propagate into the analysis. With this in
mind we therefore measure the cross correlation of galaxies
and the extinction field. Similarly no clear signal is detected,
as seen in the right panel of Figure 3. The lowest multipole band
appears slightly suggestive but is not significant given the
increased uncertainty from cosmic variance. To test this further
we remove regions in the selection function corresponding to
high extinction (> 0.1 mag) and repeat our analysis. This
removes ≈ 15% of the survey area, in regions concentrated
mainly at the edges of the survey geometry. We find this has
a negligible effect on the overall C_ℓ profiles, including the
largest scales, as seen in the same figure.

We also repeated the clustering measurement analysis with
the DR6 redshift catalogues from [12]. These galaxies
represent only a 1% reduction in area but have redshifts derived
using a variety of template based procedures. This allows us
to examine the extrapolation of the ANNz-2SLAQ training set
with sky position. This is because the training set is limited to
a narrow stripe within a limited region of the sky (and
therefore extinction, for example) whereas the template based
procedures are effectively blind to this calibration issue. We find
no changes over large scales, which is consistent with the
preliminary redshift tests performed in [12].

Other systematics could remain and contribute to the C_ℓ
measurement from an improper estimation of the selection
function. Examples include variations in seeing, photometric
calibration, overlapping survey stripes and regions of low
Galactic latitude. However, the previous MegaZ analysis [7] and
the photometric study of [9] tested their profiles against these
aforementioned effects and found no significant change across
any scale.
Alternative Models - Although more speculative, a number
of physical theories could produce a signature similar to our
observed feature. For example, a modification to gravity or
dark energy clustering [17] would give rise to an amplified
signal over large scales. In particular, changes to gravity would
be enhanced within the statistic through redshift space
distortions, which act to alter multipoles ℓ < 50 in the C_ℓ [3][8][9],
i.e. the distortions are sensitive to changes in d ln δ / d ln a.
Moreover, it would be interesting to see whether a complete
non-linear treatment of redshift space distortions could
influence these multipoles further. Some studies have also argued
that large scale inhomogeneity or voids can give rise to the
observed accelerated expansion. One would expect some
alteration to the C_ℓ when the scales probed impinge upon any
transition. However, the analysis would have to be altered for
a comparison to that framework. In addition, significant
non-Gaussianity from an exotic inflationary scenario is capable of
causing an increase in biasing over large scales. Indeed,
photometric surveys may prove to be one of the best methods to
constrain non-Gaussianity in this way [18].

Naturally, any proposed explanation for this effect must be
consistent with other probes and data. For example,
alterations to the growth over large scales must be consistent with
the CMB through the ISW effect. However, it is interesting
that there are debates regarding various anomalies there too
[19][20]. Likewise, some analyses already imply tension with
void-like models for acceleration [21]. Finally, it would be
intriguing to see if the large amplitude implied in the power
spectrum is similar to that which can produce bulk flows in
velocity surveys [22].

Conclusions - Using the largest ever galaxy survey we find 
an excess of clustering in the angular power spectra of Lu- 
minous Red Galaxies (LRGs) that is not predicted by stan- 
dard cosmology. This is evident over low multipoles/large 
scales, particularly in the highest redshift bin. This could be 
the consequence of systematics that affect the furthest, faintest 
and photometrically least reliable galaxies. Alternatively, this 
could be the sign of new physics over the largest scales probed 
by the survey, which would occur at the highest redshift. 

We tested the survey and statistics and quantified the
anomalous feature. We found no evidence to suggest it was caused
by systematics from Galactic extinction, survey geometry,
star contamination or the redshift estimation method.
Excess power was measured in all four bins but was ~4σ at
the highest redshift (0.6 < z < 0.65). Furthermore, any
boost to the power spectrum emerges at an effective scale
k < 0.01 h Mpc^-1 or λ > 700 h^-1 Mpc. We concluded by
highlighting a number of theories that could give rise to this
effect, such as exotic dark energy, modified gravity, changes to
redshift space distortions, large scale inhomogeneity or
non-Gaussianity.
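
The effective scale quoted above can be checked to order of magnitude by mapping multipole to wavenumber. The sketch below assumes a flat ΛCDM background with Ω_m = 0.3, the small-angle relation k ≈ (ℓ + 1/2) / x(z), and z = 0.625 as the centre of the highest redshift bin. The cosmological parameter value, the mapping and the function names are illustrative assumptions and are not taken from the analysis itself.

```python
import math

C_OVER_H0 = 2997.92458  # Hubble distance c / H0 in h^-1 Mpc

def comoving_distance(z, omega_m=0.3, n_steps=1000):
    """Comoving distance in h^-1 Mpc for flat LCDM, via the midpoint rule."""
    dz = z / n_steps
    total = 0.0
    for step in range(n_steps):
        z_mid = (step + 0.5) * dz
        e_z = math.sqrt(omega_m * (1.0 + z_mid) ** 3 + 1.0 - omega_m)
        total += dz / e_z
    return C_OVER_H0 * total

def ell_to_k(ell, z, omega_m=0.3):
    """Approximate wavenumber in h Mpc^-1 probed by multipole ell at redshift z."""
    return (ell + 0.5) / comoving_distance(z, omega_m)

z_bin = 0.625  # centre of the 0.6 < z < 0.65 bin
print(f"x(z = {z_bin}) = {comoving_distance(z_bin):.0f} h^-1 Mpc")
for ell in (4, 9, 14):
    k = ell_to_k(ell, z_bin)
    print(f"ell = {ell:2d}: k = {k:.4f} h Mpc^-1, "
          f"wavelength = {2.0 * math.pi / k:.0f} h^-1 Mpc")
```

Under these assumptions the lowest multipole band maps to wavenumbers below roughly 0.01 h Mpc^-1 and wavelengths of several hundred h^-1 Mpc, in line with the scale stated above.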

It will be fascinating to see if large volume spectroscopic
surveys, such as BOSS, observe such anomalies in the future
given the different dependence on systematics. However, one
might expect methodologically similar photometric surveys
like the Dark Energy Survey to observe these features again.
Clearly, even more effort to understand factors affecting
completeness or to estimate a non-discrete selection function will
be invaluable.